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Abstract 

A family of four-stage third-order explicit Runge-Kutta schemes is derived that requires only two 
storage locations and has desirable stability characteristics. Error control is achieved by embedding 
a second-order scheme within the four-stage procedure. Certain schemes are identified that are 
as efficient and accurate as conventional embedded schemes of comparable order and require fewer 
storage locations. 


Section 1: Introduction 


Runge-Kutta (RK) embedding techniques are an effective means of solving non-stiff ordinary dif- 
ferential equations (ODE’s). (See references [1], [2], [3], [4] for examples of high-order RK schemes 
that utilize embedding.) Embedding utilizes two formulas of orders p and q (p ^ q ) to calculate 
the evolution of the solution in time. By comparing the two solutions at each time step, an estimate 
of the temporal error can be determined and can be used to adjust the time step. For example, 
if q = p -f 1 ? then the difference between the pth- and ^th-order solutions provides a measure 
of the error committed in using formula p. The two solutions cannot be advanced without com- 
putational overhead; however, this overhead can be minimized by requiring both formulas to have 
similar coefficients and storage requirements. In a best-case scenario, no additional work or storage 
is necessary for the implementation of the embedded scheme. To date, most embedded schemes have 
been optimized in terms of accuracy and efficiency, with little regard to storage requirements. 

For the ODE’s that result from the semidiscretization of partial differential equations (PDE’s) 
(fluid mechanics, for example), the overriding consideration is often the availability of fast memory. 
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A numerical integration technique that minimizes memory storage is essential and can be formulated 
with a RK methodology. Williamson [5] showed that all second-order and some third-order explicit 
RK schemes can be cast in a 2N-storage format, where N is the dimension of the system of ODE’s. 
He also showed that the four-stage fourth-order RK schemes cannot be put into 2N-storage format. 
Carpenter et al. [6] showed that a fourth-order 2N-storage scheme could be achieved with five 
stages (this scheme is abbreviated RK[5,4]-2N). In addition, they showed that tuned four-stage third- 
order schemes (RK[4,3]-2N) are significantly more efficient than conventional three-stage third-order 
schemes (RK[3,3]-2N) for certain problems. 

The principal motivation of this work is to derive a family of schemes that requires 2N storage 
and that has the capability to monitor temporal integration error. Specifically, a four-stage third- 
order RK scheme that satisfies the 2N-storage constraint is sought (RK[4,3]-2N). To accomplish error 
control, a three-stage second-order 2N-storage RK scheme (RK[3,2]-2N) is embedded within the first 
three stages. (The resulting scheme is abbreviated RK[4,3(2)]-2N.) The family is then tuned so that 
a desirable stability envelope is achieved for the four-stage scheme. In section 2, we describe the 
conventional and the 2N-storage RK nomenclature. In section 3, we derive a new family of four-stage 
third-order 2N-storage schemes with an embedded formula. We then optimize the family for maximal 
stability characteristics. In section 4, the time-step control procedure is described. In section 5, the 
efficacy of the new schemes is demonstrated on several test problems, and then conclusions are drawn 
in regard to the utility of the new schemes. 

Section 2: Runge-Kutta Nomenclature 


We are concerned with the numerical solution of the initial value problem 

^ U(t 0 ) = Vo 

Assume that the discrete approximation is made with an M - stage explicit RK scheme which includes 
an embedded scheme within the M - stage procedure. The implementation over a time step h is 
accomplished by 
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where U° = U(t 0 ) and U 1 and U l are the solutions at time level n + 1 of order p and q, respectively. 
The fixed scalars a,j, 6 j, 6 J; c, are the coefficients of the RK formula and, for a four-stage third-order 
scheme, must satisfy the equations [7] 
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The last two constraints in equation (1) ensure second-order accuracy of the three-stage embedded 
scheme. 


To devise low-storage RK schemes, Williamson [5] exploits the technique of leaving useful in- 
formation in the storage register. Each successive stage is written onto the same register without 
zeroing the previous value. Thus, the Af -stage algorithm becomes 

dU 3 = AjdUj. i + hF{Uj) 

Uj = Uj-i + BjdUj j = 1,M 

So that the algorithm is self-starting, Ai = 0. Only the dU and U vectors must be stored, which 
results in a 2N-storage algorithm. 


The following relations, first presented by Williamson, [5] describe the dependence between the 
2N-storage variables Aj and Bj and the conventional RK variables a,j, bj, and c,: 


Bj — G j+i,j 

Bm = f>M 

A-i = (bj-i ~ Bj-i)/bj 
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The precise values of Aj and Bj that are required to yield a higher order scheme remain to be 
determined. 

In terms of the Butcher array (see reference [7] for details), the relationship between the conven- 
tional RK scheme and the 2N-storage RK scheme can be expressed as 
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Note that this form assumes four explicit stages, with the coefficients a 4 j, j = 1,4 replaced by 

h,j = M. 


Section 3 : Four-Stage Third-Order RK Schemes With Embedding 


The solution to the four-stage third-order RK scheme is formed by solving eight equations in 
fourteen variables and is, in general, a six-parameter family. The variables a, tJ , bj,cj are changed to 
Aj,Bj using the relationships defined in equations ( 3 ). Specifically, the values 02,1, (13,2, bj. b 4 , £> 3 , 
& 2 , and bi are expressed in terms of the values B 4 , B2, B3, B 4 , A 4 , A3, and A2, respectively. By 
definition, the conditions c, = £* =1 a,j will automatically be satisfied. The only conditions that are 
not immediately satisfied involve a 3jl , b\, and b 2 . These three conditions provide three additional 
constraints on the system, and eliminate three of the six degrees of freedom. If a three-stage second- 
order scheme is embedded, then two additional constraints are provided and the number of degrees 
of freedom is reduced to one. A solution that involves one free parameter can be written in Butcher 
array form as 
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where X = (12 c 3 3 - 24 c 3 2 + 16 c 3 - 3 ) and Y = (6 c 3 2 - 6 c 3 + 1). Written in 2N-storage form, 
the arrays Aj and B 3 become 
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We now use efficiency and accuracy considerations to isolate specific values of C3 that exhibit 
desirable characteristics. The linear stability of the four-stage third-order schemes is governed by 
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G = l+Z+|Z 2 +|Z 3 +^j fc=1 biaijdjkCkZ 4 . (See Butcher [7] for details.) Because J2tj,k=i b,a,ja jk Ck = 
_ (2c? 1 ) 2 ^ — i8c 3 +7) ajj va i ues D f C3) the linear stability envelope is found by solving |G| <1, 

where 


G = 1 + Z + ^Z 2 + ^Z 3 - 
l 0 


(2 — 1) eg (12 c^ 2 — 18 eg + 7) 4 

12XF 


Carpenter et al. [6] reported that the four-stage third-order RK schemes (RK[4,3j) have desirable 
stability characteristics (for use with hyperbolic PDE’s) for J2tj,k = i ^i a >j a jkCk < with optimal 
values near jj. For values greater than the RK[4,3] schemes abruptly become unstable. For 
values less than a gradual decrease in stability occurs. Figure 1 illustrates this behavior by 
showing the stability of the scalar hyperbolic equation u t + u x = 0. The spatial derivative u x is 
approximated with a periodic sixth-order compact spatial operator. The RK[4,3] schemes are used 
as the temporal discretization. In this test problem, the spatial eigenvalues are purely imaginary and 
are representative of the behavior of many spatial operators used for hyperbolic PDE’s. 


Note that a linear fourth-order scheme is recovered if Y^t,j,k=i ^i a %j a jk c k = Thus, the linear 
stability is identical to that encountered with four-stage fourth-order RK schemes. The value of C 3 


for which £- JiJt=1 M.jajfcC* = ^ is c 3 = 


_ !±ia 


If rational numbers are desirable, then a scheme for which the stability is nearly identical to the 
linear fourth-order case is c 3 = or in Butcher form 
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for which the 2N-storage array becomes 
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For this scheme, £?,j,jt=i b t a tJ a }k c k = 29296^07218 ~ 2 sM' Note ^ at this scheme is very close to the 
embedded four-stage third-order RK pair RK[4,3(2)]-3N proposed by Sharp et al. [8] (which is not a 
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2N-storage scheme): 
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for which ka {j a 3k c k = ~ 27^4 ■ 

For linear problems, the optimal RK[4,3(2)]-2N scheme is the fourth-order case c 3 = — f-*-. 
The optimization of the accuracy for nonlinear problems is more difficult because the optimization 
is problem dependent. We can, however, use heuristic arguments to identify certain schemes as less 
desirable. Verner [9] and Prince et al. [4] cite several theoretical considerations that should be used 
in determining desirable high-order RK schemes and schemes with embedding. Those theoretical 
considerations that are relevant to this work are 


I Each intermediate time level (c,-, i = 1,4) should be in the interval [0,1] to control the effect of 
rapidly changing derivatives. 

II To minimize roundoff error, the c, , z = 1,4 should be reasonably distinct to avoid large 6, an d 
a t J values. 

III The weights of the bj,j = 1,4 should be positive. 

IV Coefficients should incorporate rational numbers that require a small number of digits. 

V The leading-order truncation terms ||r ,+1 || should be small. 

VI The leading-order truncation terms of the embedded scheme ||f p+1 || should dominate its error. 

VII None of the leading-order truncation terms of the embedded scheme ||f p+1 || should vanish, 
which ensures that each will contribute to the local error estimate 6„+i. 


The first four conditions are relevant to the accuracy of any RK scheme. The last three pertain to 
embedded schemes. 


In the RK[4,3(2)]-2N family of schemes, no values of 0 < c 3 < 1 exist for which 0 < h, < 1 for 
all values of j = 1,4. Constraint I and IV, therefore, can not be satisfied simultaneously. For reasons 

^ | 3 /$ 

of stability and accuracy, all practical RK[4,3(2)]-2N schemes will have values of c 3 « — The 
coefficient matrices bj, and Cj are well behaved for these values of c 3 , and the truncations terms 
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r ,+1 and f p+1 have satisfactory properties. The coefficient matrices become ill-conditioned near the 
singular values c 3 « |, |, 1, and Restrictive stability domains preclude all these values 

for c 3 , except the point c 3 — ► |. Values of c 3 near § should not be used, in spite of the acceptable 
stability domain. 


Section 4: Time-Step Control 

For many systems of ODE’s that result from the semidiscretization of PDE’s, the temporal error 
does not need to be monitored at each time step because the maximum stable time step yields 
reasonable levels of temporal accuracy. Under these circumstances, the prominent concern in choosing 
a time-advancement scheme is the efficiency of the scheme. (Efficiency is defined as the absolute size 
of the stability envelope relative to the number of stages of the scheme.) Carpenter et al. [6] 
demonstrated that the efficiency of the RK[4,3]-2N schemes was as good or better than the RK[3,3]- 
2N schemes. The required work per time step is increased by one third, but the increased stability 
domain more than offsets the increased computational cost per step. This result is consistent with 
those obtained by Sharp et al. [8] for other high-order RK schemes. The embedded RK[4,3(2)]-2N 
schemes have the same stability envelope as the RK[4,3]-2N scheme and will, therefore, exhibit the 
same stability advantage over the RK[3,3]-2N schemes. 

Many nonstiff equations exist for which the time step must be determined by the solution error 
and not by stability considerations. Embedded RK schemes provide a means of adjusting the time 
step during the calculation to achieve a desired temporal error level. The quantity 6 n+1 = U 1 — U is 
the local error at time n + 1 in the pth- order formula. Given <5 n+1 , the widely used formula quoted 

by Hull et al. [10] h n+l = k h n {p^L^’ Can be USed t0 ad j ust the time step t0 contro1 the 
error per step, where e is the solution error tolerance. (For the schemes proposed here, p = 2 and 

k = 0.95 is a satisfactory constant.) 

An advantage of the RK[4,3(2)]-2N schemes is that 6 n+1 = U 1 - U 1 is available without the 
additional expense of computing it and it does not require any additional storage. Specifically, the 
estimate <$ n+1 = U 1 - U 1 = B 4 dU 4 . This results from the fact that the scheme is second- and 
third-order accurate after the third and fourth stages, respectively. The difference between the two 
solutions must be the leading-order error term in the second-order formula (assuming the scheme 
satisfies condition VI listed above). 

A disadvantage is that the time step can be adjusted only after the error has been committed 
because of the 2N-storage constraint. For conventional embedded RK schemes, if the prescribed error 
tolerance is exceeded, then the entire step is repeated with a smaller time step until the specified 
tolerance is met. This option is not available for the 2N-storage scheme because the original solution 
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vector at time level n is overwritten by the intermediate solution vectors. The subsequent time step 
can be adjusted, but the error incurred during the “failed” step is accumulated into the solution 
vector for all time. In practice, adjustment of the error tolerance to a lower level eliminates this 
problem at the expense of more computational cost. 


Section 5: Accuracy and Efficiency of RK [4,3(2)] Embedded RK Schemes 


Three test problems are used to compare the new embedded RK[4,3(2)]-2N schemes with error 
control. The first is a nonlinear ODE used by Dormand et al. [3] to test the accuracy of various RK 
schemes. The ODE is defined by y' = y cos(x), t/(0) = 1 on the interval 0 < x < 20, with 
the exact solution y(x ) = exp sin M. Figure 2 shows a convergence study for the following schemes: 
the second-order RK[3,2]-2N (the embedded scheme in RK[4,3(2)]-2N), Williamson’s [5] RK[3,3]- 
2N, the embedded RK[4,3(2)]-2N(a) (c 3 = the embedded RK[4,3(2)]-2N(b) (c 3 = ^), 

and the classical fourth-order RK[4,4]-3N scheme. All schemes approach the exact solution at their 
theoretical rate. Note that the Williamson RK[3,3]-2N scheme is nearly indistinguishable from the 
two embedded RK[4,3(2)]-2N(a) and RK[4,3(2)]-2N(b) schemes. (Williamson claims the RK[3,3]-2N 
scheme is optimal in terms of error.) This study verifies the nonlinear accuracy of the newly developed 
2N-storage schemes for ODE’s. Although the third-order scheme exhibits comparable accuracy, the 
Williamson RK[3,3]-2N scheme requires one-third fewer function evaluations than the RK[4,3(2)]-2N 
schemes but has a considerably smaller stability envelope. 


The second problem is from the class D orbit equations used by Hull et al. [10] to test the 
accuracy and efficiency of ODE solvers. The problem is defined by 


with r 2 = y\ + j/|. 


dt 

dt * 4 ’ 

m. — 

dt r * ’ 

2/i (0) = 

1 - e, 
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y<i(o) = 


The exact solution is 



yi = cos(u) - e, 

2/2 = 

\/l — e 2 sin(i 

= zlkgL 

l-ccos(u)’ 

2/4 

y/l—e 2 cos(u) 
1— ecos(u) 


with u — e sin(u) — t = 0. The eccentricity of the orbit is governed by e and becomes singular for 
values of t — ♦ 1 . A value of e = 0.9 provides a severe test of an embedded scheme’s error control 
capabilities. 


We begin by testing the error control features in the 2N-storage scheme. The conventional RK 
schemes reintegrates a rejected time step by using the solution information stored at time level n; the 
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2N-storage schemes do not have this capability. The effects of this error accumulation on the long- 
time accuracy of the solution was tested by comparing the RK[4,3(2)]-2N(a) scheme in low-storage 
and conventional RK format. In the conventional format, the rejected time steps were reintegrated, 
to provide a direct comparison between the two approaches. The results on the orbit problem indicate 
that the accumulation of error could be controlled by decreasing the constant k in the error control 
expression. For values of k < 0.9, the percentage of rejected steps becomes insignificant, and both 
methods give the same accuracy and efficiency. 

As a final test on the orbit problem, the new embedded RK[4,3(2)]-2N(a) scheme was compared 
with the embedded RK[4,3]-3N scheme reported by Sharp et al. [8] Figure 3 shows a logarithmic 
comparison of the error versus the number of time steps for each method. Both schemes have 
comparable absolute stability envelopes, and the efficiency for a given accuracy is slightly better for 
the new 2N-storage scheme. 

The last problem is the solution of the linear hyperbolic equation defined by 


u t + u x = 0 , 0 < ® < l,f > 0 

( 4 ) 

u(0, t) = sin27r(— t), t> 0 

( 5 ) 

u(x,0) = sin27r(x), 0<x<l 

(6) 


The exact solution is 


u(x,t ) = sin27r(x — t), 0<®<l,f>0 

and is a model for the class of problems that the embedded RK[4,3(2)]-2N schemes were developed 
to integrate. 

Equations (4) through (6) are solved with four different embedded RK[4,3(2)]-2N schemes; specif- 
ically, the cases c 3 = — f-*, c 3 = |||, c 3 = and c 3 = In all cases, the spatial operator used 
is the sixth-order compact scheme developed by Carpenter et al. [11] and shown to be formally sixth- 
order accurate. The physical boundary condition is imposed by solving the differentiated boundary 
condition on the boundary with the RK procedure. This technique was shown by Carpenter et al. 
[12] to yield a fourth-order temporally accurate procedure. Specifically, the boundary condition is 
d?u(0,t)/dt 3 = g"'(t), where g is the physical boundary condition at the inflow plane. The CFL’s 
that govern the stability of the hyperbolic problem range from y/2 for c 3 = — If-*- to 1.16 for c 3 = 

After grid refinement with a vanishingly small CFL, all schemes recover the theoretical spatial 
sixth-order accuracy. The leading-order error terms for values of the CFL near the theoretical max- 
imum are dominated by the temporal error components. On a specific grid, temporal refinement 
showed third-order temporal accuracy. Fourth-order temporal accuracy was obtained for the specific 
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value c 3 = — Table 1 shows the results from a grid-refinement study performed with a CFL 
of 1. Only the cases c 3 = and c 3 = ^ are shown because they bracket the behavior of the 

other two schemes. 


Table 1. Grid Refinement for Embedded RK[4,3(2)]-2N Schemes with CFL = 1 




c 3 = 

i±iZE 
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62 

1QO 


Grid 

logL 2 
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log L 0 0 
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log L 2 
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log Loo 
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-4.888 

3.96 

-3.931 

3.01 

-3.745 

3.09 

-3.786 

3.00 

161 

-6.092 

4.00 

-4.836 

3.00 

-4.660 

3.04 

-4.689 

3.00 

321 

-7.296 

4.00 

-5.739 

3.00 

-5.567 

3.01 

-5.592 

3.00 

641 

-8.501 

4.00 

-6.642 

3.00 

-6.472 

3.01 

-6.495 

3.00 


The log L 2 column represents the logarithm base 10 of the L 2 solution error, and the column 
represents the maximal error in the solution as calculated by the embedded scheme. For the value 
c 3 = 1+ y^ , the embedded scheme overpredicts the solution error on this linear problem. For the 


values c 3 7^ a direct correlation between the solution error and the predicted error from the 


embedded scheme. 


Table 2 shows the results from a temporal refinement study on a grid of 161 points. Two values 
of c 3 are used to show the trends of the study. Fourth-order temporal accuracy was obtained for the 
specific value c 3 = + Y * . Third-order temporal accuracy was observed for c 3 ^ 


Table 2. Temporal Refinement of Embedded RK[4,3(2)]-2N Schemes 




C3 = 

~i±sr 

3 



C3 = 

62 

100 


CFL 

logL 2 

Rate 

log Loo 

Rate 

log L 2 

Rate 

log Loo 

Rate 

1 

1 

1 

t 

? 

16 

1 

32 

-6.169 

-7.353 

3.93 

-4.836 

-5.739 

3.00 

-4.751 

-5.653 

3.00 

-4.689 

-5.593 

3.00 

-7.733 

1.26 

-6.642 

3.01 

-6.543 

2.96 

-6.500 

3.00 

-7.731 

- 

-7.546 

2.99 

-7.331 

2.62 

-7.399 

3.00 

-7.730 

- 

-8.448 

3.00 

-7.677 

1.15 

-8.301 

3.00 

-7.730 

- 

-9.352 

3.00 

-7.724 

0.16 

-9.205 

3.00 


Note that the solution error asymptotes to a value dictated by the spatial error terms; the embedded 
error converges at a rate of at least three (independent of the CFL used). This study shows that for 
the cases in which c 3 ^ 1+ ^ - temporal error can be controlled by monitoring the embedded error. 
When the embedded scheme with a value c 3 = — is used on linear problems, the temporal error 
will be overpredicted. 
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These three test problems demonstrate the efficacy of the embedded RK[4,3(2)]-2N schemes for 
linear and nonlinear ODE’s for which time control is important. The behavior of the new schemes 
is similar to other embedded RK schemes of comparable order that exist in the literature. If the 
overriding concern in the choice of integrator is the reduction of storage and temporal error control 
is necessary, then the newly developed embedded RK[4,3(2)]-2N schemes are clearly advantageous 
over existing 2N-storage schemes. 


Conclusions 

A class of new four-stage third-order RK[4,3(2)]-2N Runge-Kutta (RK) schemes are derived that 
require only two storage locations. The class has a three-stage second-order 2N-storage RK scheme 
embedded within the first three stages. A comparison of the second- and third-order solutions can 
give an estimate of the temporal error at each time step. The subsequent time step can then be 
adjusted to achieve a desired error control. A particular scheme is identified that has the desirable 
efficiency characteristics for hyperbolic and parabolic initial (boundary) value problems. In the 
inviscid and viscous limits, this new RK[4,3(2)]-2N storage scheme has comparable accuracy for a 
given step size and has a larger allowable stability domain than the RK[3,3]-2N scheme advocated 
by Williamson. The absolute stability of the new schemes is comparable to that achieved with the 
classical four-stage fourth-order RK scheme. Numerical tests are presented that verify these results 
on nonlinear ordinary differential equations (ODE’s) and linear hyperbolic equations. 
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FIGURE 1. CFL dependence on parameter ^ for RK[4,3] schemes where a = ]Cfj,fc=i ^i a ijajkCk- 
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FIGURE 2. Comparison of convergence between RK[4,3(2)]-2N schemes and conventional RK[3,3]- 
2N and RK[4,4]-3N schemes. Problem is ODE defined by y' = y cos(x). 
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